Exercise 1 (4 points)

Print the 6 first lines of the R-built-in data.frame trees

##   Girth Height Volume
## 1   8.3     70   10.3
## 2   8.6     65   10.3
## 3   8.8     63   10.2
## 4  10.5     72   16.4
## 5  10.7     81   18.8
## 6  10.8     83   19.7

Print only the column names

## [1] "Girth"  "Height" "Volume"

What is the dimension of trees?

## [1] 31  3

Plot the trees height and volume as a function of their girth in two different graphs. Make sure the axis labels are clear

In each graph, add a red dashed line corresponding to the relevant correlation that you observe (average value, linear correlation…)

Explain your choice and write the corresponding values (average value and standard deviation, or slope, intercept and corresponding errors). Round the values to 2 decimals.

The average height is 76 with standard deviation 6.37.

The volume evolves with a slope 5.07 ± 0.25 and intercept -36.94 ± 3.37.


Exercise 2 (6 points)

Print the 3 first lines of the R-built-in data.frame USArrests. This data set contains statistics about violent crime rates by US state. The numbers are given per 100 000 inhabitants, except for UrbanPop which is a percentage.

##         Murder Assault UrbanPop Rape
## Alabama   13.2     236       58 21.2
## Alaska    10.0     263       48 44.5
## Arizona    8.1     294       80 31.0

What is the average murder rate in the whole country?

## [1] 7.788

What is the state with the highest assault rate?

## [1] "North Carolina"

Create a subset of USArrests gathering the data for states with an urban population above (including) 80%.

How many states does that correspond to?

## [1] 12

Within these states, what is the state with the smallest rape rate?

## [1] "Rhode Island"

Print this subset ordered by decreasing urban population.

##               Murder Assault UrbanPop Rape
## California       9.0     276       91 40.6
## New Jersey       7.4     159       89 18.8
## Rhode Island     3.4     174       87  8.3
## New York        11.1     254       86 26.1
## Massachusetts    4.4     149       85 16.3
## Hawaii           5.3      46       83 20.2
## Illinois        10.4     249       83 24.0
## Nevada          12.2     252       81 46.0
## Arizona          8.1     294       80 31.0
## Florida         15.4     335       80 31.9
## Texas           12.7     201       80 25.5
## Utah             3.2     120       80 22.9

Print this subset ordered by decreasing urban population and increasing murder rate.

##               Murder Assault UrbanPop Rape
## California       9.0     276       91 40.6
## New Jersey       7.4     159       89 18.8
## Rhode Island     3.4     174       87  8.3
## New York        11.1     254       86 26.1
## Massachusetts    4.4     149       85 16.3
## Hawaii           5.3      46       83 20.2
## Illinois        10.4     249       83 24.0
## Nevada          12.2     252       81 46.0
## Utah             3.2     120       80 22.9
## Arizona          8.1     294       80 31.0
## Texas           12.7     201       80 25.5
## Florida         15.4     335       80 31.9

Plot an histogram of the percentage of urban population with a binning of 5%. Add a vertical red line marking the average value. Make sure the x axis shows the [0,100] range.

Is there a correlation between the percentage of urban population and the various violent crime rates? argument your answer with plots.

No clear correlation appears.


Exercise 3 (10 points)

In high-pressure experiments, the pressure in the Diamond Anvil Cell (DAC) is calibrated through the measure of the Raman shift of a tiny ruby crystal placed in the pressure transmitting medium next to the measured sample.

Write a function returning the pressure \(P\) as a function of the ruby Raman shift position \(\omega\) and the excitation laser wavelength \(\lambda_l\): \[P(\omega, \lambda_l) = \frac{A}{B}\left[\left(\frac{\lambda}{\lambda_0}\right)^B-1\right] (GPa)\] where \(A=1876\) and \(B=10.71\), \(\lambda\) is the the measured wavelength of the ruby \(R_1\) line (the most energetic one) and \(\lambda_0=694.24\) nm is the zero-pressure value at 298 K1. The relationship between the wavenumber \(\nu\) in cm\(^{-1}\) and the wavelength \(\lambda\) in nm is given by \(\nu(\text{cm}^{-1})=\frac{10^7}{\lambda(\text{nm})}\), and the Raman shift \(\omega=\Delta\nu=\nu_l-\nu=\frac{10^7}{\lambda_l}-\frac{10^7}{\lambda}\) (cm\(^{-1}\)).

Write a function returning a normalized Lorentzian as a function of its center \(x_0\) and its full width at half maximum \(\Gamma\): \[L(x,x_0,\Gamma)=\frac{\Gamma}{2\pi}\frac{1}{\frac{\Gamma^2}{4}+\left(x-x_0\right)^2}\]

Store the list of files containing ruby in their name in the Data/ folder into a variable flist. Print its length.

## [1] 5

Plot with points the first file in flist. Find the position of its maximum and store it in xmax. Guess roughly the parameters needed to fit the experimental data by y0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2), and add a blue line on the plot to represent this function.

Using nls(), fit the first spectrum in flist by y0+A1*L(x,x1,FW1)+A2*L(x,x2,FW2) and using the starting parameters you defined before. Plot the experimental data again and add the fitted spectrum as a red line.

Based on the above procedure, for each file in flist (so, use a for loop), fit the Raman spectrum by the sum of two Lorentzian functions, and store the fitting parameters into a data.frame called ruby_fit also containing the names of the corresponding files. Attention: the initial guesses for amplitudes and widths can be constant, but the peaks positions should evolve for each spectrum. The difference between the two peaks is always roughly 30 cm\(^{-1}\), and the largest peak is always the most energetic one. Check that your fits are correct by printing the experimental data and the fitted result at each iteration (add the name of the file as the plot title).

Add a column in ruby_fit corresponding to the estimated pressure rounded to 1 decimal. The excitation wavelength in this experiment was 532 nm. Print the resulting ruby_fit table using knitr::kable(ruby_fit)

name y0 A1 A2 FW1 FW2 x1 x2 P
SWNT_ruby_01.txt 121.31548 166356.2 327696.3 11.39192 15.06717 4371.581 4401.312 1.1
SWNT_ruby_02.txt 90.62964 152514.6 292086.0 11.45237 14.76490 4391.899 4421.366 3.8
SWNT_ruby_03.txt 92.03126 182837.6 335173.7 11.53816 14.62646 4414.858 4444.203 6.8
SWNT_ruby_04.txt 319.62879 294710.7 577912.4 11.08481 14.16231 4444.792 4474.054 10.9
SWNT_ruby_05.txt 121.25907 155329.7 392016.5 13.02157 17.95192 4482.789 4513.105 16.5

Store all file names containing “RBM” into a variable fRBM. Load all the corresponding spectra into a single data.frame called spec with 3 columns: Raman shift \(\omega\), Intensity, Pressure. Of course, the indexes in the file names between the ruby and RBM files match. In the Intensity column, store the intensity normalized to [0,1].

Using ggplot2, plot with points the stacked normalized RBM band spectra vertically shifted by P, with a color for each spectrum corresponding to the pressure. Make the plot interactive.